Data Consistency of Two National Registries in Iran: A Preliminary Assessment to Health Information Exchange

Background: The National Spinal Cord Injury Registry of Iran (NSCIR-IR) and the National Trauma Registry of Iran (NTRI) were established to meet the data needs for research and assessing trauma status in Iran. These registries have a group of patients shared by both registries, and it is expected that some identical data will be collected about them. A general question arises whether the spinal cord injury registry can receive part of the common data from the trauma registry and not collect them independently. Methods: We examined variables captured in both registries based on structure and concept, identified the overlapping period during which both systems recorded data in the same centers and extracted relevant data from both registries. Further, we evaluated the data for any discrepancies in amount or nature and pinpointed the underlying reasons for any inconsistencies. Results: Out of all the variables in the NSCIR-IR database, 18.6% of variables were similar to the NTRI in terms of concept and structure. Although four hospitals participated in both registries, only two (Sina and Beheshti Hospitals) had common cases. Patient names, prehospital intubation, ambulance arrival time, ICU length of stay, and admission time were consistent across both registries with no differences. Other common data variables had significant discrepancies. Conclusion: This study highlights the potential for health information exchange (HIE) between NSCIR-IR and NTRI and serves as a starting point for stakeholders and policymakers to understand the differences between the two registries and work toward the successful adoption of HIE.


Introduction
In accordance with the directives established by the International Organization for Standardization (ISO), electronic health records (EHRs) are defined as computer-based systems designed to archive data regarding an individual's medical condition.[1] EHRs have transformed the clinical environment: they have simplified clinical workflows and allow instant access to patient data.[2,3][5][6][7] The World Health Organization (WHO) considers data the "fuel" of eHealth and emphasizes that universal health coverage requires EHRs.[8] Management of EHRs can be challenging due to fragmented, duplicated, and inconsistent data.
To address these challenges, the concept of health information exchange (HIE) has been developed to facilitate the secure and efficient exchange of EHRs between different health systems.[9] HIE plays a critical role in facilitating timely and efficient healthcare delivery, particularly in cases where patients are receiving care from multiple healthcare providers.[9][12] However, HIE can also be challenging. These challenges can have significant implications for patients' well-being, potentially even leading to life-threatening outcomes.[13] Several obstacles come with HIE, such as costly installation expenses, low-quality data, reluctance on the part of patients and health centers due to fears of competition or established procedures at medical facilities, and the lack of a uniform format for data transmission. It is crucial to overcome these barriers for HIE to be efficiently implemented.[14,15] [17][18]
Similarly, the National Trauma Registry of Iran (NTRI) was launched in 2016 to address the shortcomings of the Hospital Health Information System (HIS) in assessing the trauma status of patients.[19,20] According to the inclusion criteria of both registries, patients with traumatic spinal cord injury should be registered in both registries with the expectation that their data would match. The study assessed the success of these registries based on a range of factors, including the number of matching patients, the consistency of data items, and the coherence of values for patients who were identified in both systems. Furthermore, the study aimed to identify any reasons for inconsistencies or conflicts in the information recorded between the two registries.

Materials and Methods
This study entailed analysis of the structures (formats) and concepts of data elements in the registry systems. Here, the data format denotes the predefined method of inputting and storing information in a registry, and a similar concept refers to the recording of the same entity in the two registries regardless of data format. During the study, four hospitals were collaborating with both NSCIR-IR and NTRI. We examined the period during which both registries performed case finding and patient registration, and then collected data from the two registries across all four hospitals. To identify common cases in the two registries, we used several identifiers, such as the patient's first and last name (considering spelling variations), national code, referral number, patient record number, and phone number. To evaluate the consistency of data values between common patients in the two registries, ten cases were randomly selected and analyzed for inconsistencies by two researchers. Inconsistencies and discrepancies were marked, and an independent third observer created a checklist based on the types of conflict identified in the notes. The same two researchers then used the checklist to detect data inconsistencies in all common cases, and the types of discrepancies for each item were measured. Finally, the technical expert panel at the registry center level, in collaboration with registry leaders, assessed the causes and sources of the discrepancies. The classifications used for data inconsistencies or discrepancies between the two registries are presented in Table 1.
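The case-matching step described above can be illustrated with a minimal sketch. It is not the procedure actually used in the study: the field names, the similarity threshold of 0.85, and the use of Python's `difflib` for spelling variations are all assumptions made for illustration, and the sample records are fictional.

```python
from difflib import SequenceMatcher

# Hypothetical field names; the registries' real schemas are not given in the text.
ID_FIELDS = ("national_code", "referral_number", "record_number", "phone")


def normalize(value):
    """Lower-case a value and drop all whitespace so trivial variants compare equal."""
    return "".join(str(value or "").split()).casefold()


def names_similar(a, b, threshold=0.85):
    """Tolerate spelling variations by comparing full names with a similarity ratio."""
    name_a = normalize(a.get("first_name", "")) + normalize(a.get("last_name", ""))
    name_b = normalize(b.get("first_name", "")) + normalize(b.get("last_name", ""))
    if not name_a or not name_b:
        return False
    return SequenceMatcher(None, name_a, name_b).ratio() >= threshold


def is_same_patient(a, b):
    """Match on any shared non-empty identifier, otherwise fall back to the name."""
    for field in ID_FIELDS:
        value_a, value_b = normalize(a.get(field)), normalize(b.get(field))
        if value_a and value_a == value_b:
            return True
    return names_similar(a, b)


def match_cases(registry_a, registry_b):
    """Return index pairs (i, j) of records judged to be the same patient."""
    return [
        (i, j)
        for i, a in enumerate(registry_a)
        for j, b in enumerate(registry_b)
        if is_same_patient(a, b)
    ]


# Fictional example records.
nscir = [
    {"first_name": "Mohammad", "last_name": "Ahmadi", "national_code": "001"},
    {"first_name": "Sara", "last_name": "Karimi", "phone": "0912"},
]
ntri = [
    {"first_name": "Mohamad", "last_name": "Ahmadi", "national_code": ""},
    {"first_name": "Ali", "last_name": "Rezaei", "phone": "0935"},
]
matches = match_cases(nscir, ntri)
```

In this sketch the first pair is linked through the name alone, because the national code is missing in one record. In practice, name-only matches would need manual review, since a fuzzy threshold can link different patients with similar names.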

Results
A total of 69 conceptually common variables were identified between the two registries, equal to 23.71% of the NSCIR-IR variables. Of these, 54 variables had identical concepts and formats. Further details of the common variables are shown in Table 2. The study period and the number of matched registered cases for each center are shown in Table 3. No common entries were found between the registries during the selected period at two of the collaborating hospitals, so we excluded them from further analysis to avoid overlapping bias in the results. In addition, neither center had a complete record of patients.
According to the expert panel discussion, the difference in case finding is due to the source of data collection. In contrast to the NTRI, cases in the NSCIR-IR were obtained from various sources in addition to the list of patients in the emergency department, such as the list of patients admitted to the wards and reports from head nurses and neurosurgery residents; outpatients who were admitted directly were also recorded.
The nature and percentage of data inconsistency between the two registries are detailed in Tables 4 and 5. The most common discrepancies in demographic data were related to occupation, due to differences in registration methods, and to birth date, especially for immigrants and non-Iranian patients, due to the sources used for birth dates.
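The two shares reported for the common variables can be cross-checked with simple arithmetic. The sketch below back-calculates the total number of NSCIR-IR variables from the reported 69 variables and 23.71%; that total is an inference made here, not a figure stated in the text.

```python
# Figures reported in the text.
common_concept = 69          # variables with a common concept
common_format = 54           # variables with a common concept and format
share_concept_pct = 23.71    # reported share of NSCIR-IR variables

# Inferred (not reported): total NSCIR-IR variables implied by the two figures above.
total_nscir_variables = round(common_concept / (share_concept_pct / 100))

# Share of NSCIR-IR variables that match in both concept and format.
share_format_pct = round(100 * common_format / total_nscir_variables, 1)
```

Under this assumption, the share of variables matching in both concept and format comes out at the 18.6% quoted in the abstract.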
The most frequent conflicts in contact information were found in the home telephone number and address; most of the differences appeared trivial and were probably due to data entry errors, i.e., errors in one or two digits of the telephone number. However, the importance of contact information should not be underestimated, as it is essential for following up patients' health status. In some cases, the phone numbers or addresses were completely different, which the expert panel believed was due to the different sources used.
Note (Table 1): Discrepancy or conflict of data with similar structure and concept was classified into the following categories: (i) missing data values (e.g., the field for the province of the accident location was filled in NSCIR-IR but was empty in NTRI, or vice versa), (ii) typing errors (e.g., the field for the telephone number is filled in one registry as 0912XXX8872 and in the other as 0912XXX8827), (iii) different data values, or what we call data conflict (e.g., the cause of an accident is a fall in one database and a car accident in the other), and (iv) different methods of data entry (e.g., blood pressure is 120 in one registry and 12 in the other). We classified inconsistencies in data elements with similar concepts but different structures (data formats) into the following classes: (i) unanswered questions, (ii) different data values, or, as we call them, data conflicts, and (iii) different levels of detail of the data value (so-called data granularity), e.g., the field for comorbid conditions was filled in with diabetes mellitus and hypertension in one database, whereas only diabetes mellitus was entered in the other.
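The four discrepancy categories for variables with the same structure and concept (Table 1 note) could be approximated automatically before manual review. The following is a minimal sketch, not the checklist used in the study: the edit-distance threshold of 2 for typing errors and the factor-of-ten rule for different entry methods are heuristics assumed here, chosen only to reproduce the examples given in the note.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + (char_a != char_b), # substitution or match
            ))
        previous = current
    return previous[-1]


def differs_by_factor_of_ten(a, b):
    """Assumed heuristic for 'different entry method', e.g. 120 versus 12."""
    try:
        x, y = float(a), float(b)
    except ValueError:
        return False
    return x != 0 and y != 0 and (x == 10 * y or y == 10 * x)


def classify_discrepancy(value_a, value_b, max_typo_distance=2):
    """Assign one pair of field values to a discrepancy category."""
    a = str(value_a or "").strip().casefold()
    b = str(value_b or "").strip().casefold()
    if not a or not b:
        return "missing_value"
    if a == b:
        return "consistent"
    if differs_by_factor_of_ten(a, b):
        return "different_entry_method"
    if edit_distance(a, b) <= max_typo_distance:
        return "typing_error"
    return "data_conflict"
```

Applied to the examples in the note, `classify_discrepancy("0912XXX8872", "0912XXX8827")` falls under typing errors and `classify_discrepancy("fall", "car accident")` under data conflicts. Short values are a known weakness of the distance threshold, so borderline cases would still need a human reviewer.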
Regarding the occurrence of injuries, including the time, cause, and location of the incident, the activity of the person at the time of injury, the height of the fall, and the use of safety equipment, there is not complete consistency between the two registries, mainly due to different data sources and different data entry methods. Prehospital data entry errors were found to be due to the inadequacies of the ambulance forms. Similar to NTRI, the patient or the person accompanying the patient was asked to complete the required sections in the NSCIR-IR registry to fill in missing data in the outpatient form.
A comparison of data entry in the two registries revealed that patient names, prehospital intubation, ambulance arrival time, length of ICU stay, and admission time were recorded in the same way in both registries.
The study found numerous data inconsistencies and conflicts in patient vital signs, immobilization, cardiac arrest, and respiratory status in the prehospital data values. In some cases, prehospital information was recorded only in the NSCIR-IR; in others, both registries missed important prehospital information because it was missing in the data source. The quality of data also varied between hospitals and between centers within each registry; the figures for each center are shown in Table 5.

Discussion
Sonsilphong et al classified and described the various types of data-level conflicts. The concept of different data values in our research is in concordance with Sonsilphong's definition of data value conflict, which means semantically equivalent data elements with different inputted values.[21] The concept of different data input in our study matches the concept of data format conflict in Sonsilphong and colleagues' study, defined as different formats of entered values for semantically equivalent data elements.[21] Data scaling conflict was another type of conflict used by Sonsilphong et al to refer to different scales or units of measurement for semantically equivalent data items. The present study also considered this type of conflict; however, no data were found to have the potential for this type of discrepancy. Another observed conflict that was not mentioned in the classification of Sonsilphong et al was different data details, which was observed only for unstructured data items in the present study.
A systematic review by Eden et al. examined barriers and factors to HIE and divided them into three general groups: (1) completeness of data, (2) the internal organization of each system, and (3) the technology, preliminary goals, and benefits of each system.[15] In this study, we evaluated only the completeness and structure of the data. The four main barriers related to data completeness identified in this study were:

Different Case Finding Rates in the Two Registries
Case finding involves identifying (i.e., capturing) eligible patients from existing sources using a defined case definition.[22] As components of the case-finding method, the location of case finding (emergency department versus specialty department) and the source of case finding (list of emergency patients in the last 24 hours in the NTRI versus list of emergency patients, frequent visits to the wards and ICU, and registrars' interactions with residents and ward managers in the NSCIR-IR) differed between the two registries; therefore, differences in case-finding rates or missed cases are inevitable. In addition, interruptions in case finding because of staff shortages are inevitable in both registries.[17] Because coverage in both registries is less than ideal, HIE does not seem to have good prospects unless the two registries are merged and a decision is made to use a standard method with better coverage. It should be noted that although more eligible spine trauma cases are identified in NSCIR-IR, according to the results of this study, neither registry has complete coverage.
It is well known that completeness of case ascertainment is a critical quality criterion of registries and surveillance studies.[24][25] However, the present study did not include an assessment of the quality of case ascertainment because assessing the completeness of case finding in NTRI or NSCIR-IR was not its aim.

Missing Data
The results of the present study show that there are missing values for several variables in both registries. In both registries, a large amount of data is missing due to documentation deficiencies in the outpatient reports. The number of variables with missing values was higher in the NTRI. According to the expert panel, this could be due to the NTRI's higher dependence on the HIS, whereas in the NSCIR-IR the registrar is more active in data collection. Therefore, a change in the current data collection process, including data extraction from the NTRI, could affect data quality in the NSCIR-IR. In the eHealth and EHR ecosystem, data sharing between different systems is seen as a means to reduce missing data in information systems.[26] It would therefore be illogical to combine two databases in a way that increases the missing data.[27] According to previous works, differences in the mandatory or optional nature of data elements between different databases and health care information systems are another challenge to electronic clinical data exchange.[27,28]
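The concern about missing data can be made concrete with a small sketch that compares two ways of reusing data from another registry: replacing a field outright, or filling only the gaps. The values below are invented for illustration and do not come from either registry; the function names are likewise hypothetical.

```python
def is_missing(value):
    """Treat None and empty strings as missing."""
    return value is None or value == ""


def missing_rate(values):
    """Proportion of missing entries in one variable."""
    return sum(is_missing(v) for v in values) / len(values)


def replace_with_source(target, source):
    """Stop collecting the field locally and take the other registry's values as-is."""
    return list(source)


def fill_gaps_from_source(target, source):
    """Keep locally collected values and use the other registry only where they are missing."""
    return [s if is_missing(t) else t for t, s in zip(target, source)]


# Invented values for one variable (cause of injury) in four matched cases.
local_values = ["fall", None, "car accident", "fall"]
other_values = [None, "fall", "car accident", None]

rate_local = missing_rate(local_values)
rate_replaced = missing_rate(replace_with_source(local_values, other_values))
rate_filled = missing_rate(fill_gaps_from_source(local_values, other_values))
```

In this toy case, outright replacement raises the missing rate from 0.25 to 0.5, whereas gap filling lowers it to 0. Gap filling can never increase missingness, but it does not resolve cases where both registries hold different non-missing values.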

Differences in Data Values (Data Value Conflict)
Based on studies and technical frameworks, data value conflicts pose a significant challenge to heterogeneous database integration and interoperability.[29] Although previous work acknowledges the creation of ontologies as a solution for identifying and resolving value conflicts,[30,31] the challenge of determining which value is correct when resolving a conflict has not been addressed. Based on this study, it cannot be properly assessed which registry is the more accurate one. Improving the quality of data entry should be a priority for both registries.

Differences in Detail
According to the results presented in Table 3, the two registries differed in the level of detail of the entered data for seven common variables: the EMS vehicle type (air/ground), which was recorded separately in NTRI but not in NSCIR-IR; patient immobilization in the prehospital phase, which was recorded in NSCIR-IR as collar or board only; occupation, which was recorded as an occupational category in the trauma registry and as the exact occupational title in NSCIR-IR; address and telephone number details; and the province and city of residence and the location of injury. Although this is considered a discrepancy, the difference in detail between the two registries is due to the purpose of the registries rather than a difference in data quality. In each case, the policy committees of the two registries should decide whether it is necessary to standardize other variables and make significant changes, given the differences in protocols and registration methods.
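For multi-valued fields such as comorbid conditions, a difference in detail can be told apart from a genuine conflict by checking whether one registry's entry is contained in the other's. The sketch below is an assumption about how such a check might look, not a method described in the study. It does not cover category-versus-title differences such as occupation, which would need a mapping table that is not given here.

```python
def compare_detail(values_a, values_b):
    """Compare two multi-valued entries for the same patient and variable."""
    set_a = {v.strip().casefold() for v in values_a if v and v.strip()}
    set_b = {v.strip().casefold() for v in values_b if v and v.strip()}
    if not set_a or not set_b:
        return "unanswered"
    if set_a == set_b:
        return "consistent"
    # One entry is a strict subset of the other: same facts, less detail.
    if set_a < set_b or set_b < set_a:
        return "different_granularity"
    return "data_conflict"
```

With the example from the Table 1 note, `compare_detail(["diabetes mellitus", "hypertension"], ["diabetes mellitus"])` is classed as a difference in granularity rather than a conflict.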

Limitations
It is important to note that the study had some limitations. Specifically, it only examined the quality and structure of the registries' data, without a comprehensive evaluation of the barriers that could limit successful HIE. Additionally, there is no consensus on the acceptable level of difference in data for HIE to be considered successful, as this varies depending on the specific goals of each registry and its users.

Conclusion
This study showed that although a group of common patients are registered in both registries, in order to perform successful HIE, a wide range of changes are needed, including refinement of the variables, which may not fit the main purpose of the establishment of each registry.Our study serves as a starting point for policymakers and stakeholders to better understand the distinctions between NSCIR-IR and NTRI to make future decisions.

Table 1. Type of common variables and potential inconsistencies in each group

Table 2. Number of Similar Data Elements (Variables) Between NSCIR-IR and NTRI

Column headings: NSCIR-IR; NTRI; Variables with Common Concept; Variables with Common Format
NTRI, National Trauma Registry of Iran; NSCIR-IR, National Spinal Cord Injury Registry of Iran.
Note: Percentages are calculated in relation to the elements of NSCIR-IR.
(1) 163 variables are related to ASIA items that are not in the NTRI registry.
(2) The NSCIR-IR includes 10 items related to complications and 16 items related to bedsores.

Table 3. Number of Matched Cases in the Joint Centers in NSCIR-IR and NTRI According to Different Identifiers as Key

Table 4. Percentage of Data Inconsistencies for Data Elements Between NSCIR-IR and NTRI for Matched Cases

Table 5. Frequency of Types of Disagreement in Two Registry Centers